COLBY, ENEA AND MORAVEC

CONTEXT-SENSITIVE LANGUAGE RECOGNITION FOR IDIOLECTIC COMPUTER
UNDERSTANDING OF TELETYPED NATURAL LANGUAGE DIALOGUES

	Why is it so difficult for machines to understand natural
language? Perhaps it is because machines do not simulate sufficiently
what humans do when they process language. Several years of
experience with computer science and linguistic approaches have
taught us the scope and limitations of syntactic and semantic parsers
[Thorne & Bratley] [Simmons] [Schank] [Woods] [Winograd]. While
extant linguistic parsers perform satisfactorily with carefully
edited text sentences or with small dictionaries, they are unable to
deal with the everyday language behavior characteristic of human
conversation. In a rationalistic quest for certainty, and attracted
by an analogy from the proof theory of logicians in which provability
implied computability, computational linguists hoped to develop
formalisms for natural language grammars. But the hope has not been
realized and perhaps in principle cannot be. (It is difficult to
formalize something which can hardly be formulated.) In their
dialogues humans are never context-free, linguistically or
conceptually. The main problem is how to model this
context-sensitivity.

	Linguistic parsers use morphographemic analyses,
parts-of-speech assignments, and dictionaries containing multiple
word-senses, each possessing semantic features and programs or rules
for restricting word combinations. Such parsers perform a
word-by-word analysis, valiantly disambiguating at each step in an
attempt to construct a meaningful interpretation. While it may be
sophisticated computationally, a conventional parser is quite at a
loss to deal with the oddments of ordinary conversation. In everyday
discourse people speak colloquially and idiomatically, using all
sorts of pat phrases, slang and cliches. The number of special-case
expressions is indefinitely large. Humans are cryptic and elliptic.
They lard even their written expressions with meaningless fuzz and
fragments. They convey their intentions and ideas in idiosyncratic
and metaphorical ways, blithely violating rules of `correct' grammar
and syntax. Given these difficulties, how is it that people carry on
conversations easily most of the time while machines have found it
extremely difficult to continue making appropriate replies indicating
understanding?

	It seems that people `get the message' without analyzing
every single word in the input. They even ignore some of its terms.
People make individualistic and idiosyncratic selections from highly
redundant and repetitious communications. These personal selective
operations, based on idiosyncratic intentions, produce a
transformation of the input by destroying and even distorting
information. In speed reading, for example, only a small percentage
of the contentive words on each page need be looked at. These words
somehow resonate with the reader's relevant conceptual-inferential
structure, whose processes enable him to `understand' not simply the
language but all sorts of unmentioned aspects of the situations and
events being referred to in the language. In written texts up to 5/6
of the input can be distorted or deleted and the intended message can
still be successfully extracted. Spoken conversations in English are
known to be at least 50% redundant. Half the words can be garbled and
listeners nonetheless get the gist or drift of what is being said.
(Give further experimental evidence here.)

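	This insensitivity to garbling can be illustrated in
miniature by a program which attends only to contentive words. (The
sketch below is expository only; the stop-word list and message
patterns are invented, not taken from any actual system.)

# Expository sketch: recover the gist of a garbled utterance by
# attending only to contentive words.  The stop-word list and the
# message patterns are invented for illustration.

STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "are",
             "well", "um", "you", "know", "i", "mean"}

# Each known message is cued by a small set of contentive words.
PATTERNS = {
    ("hospital", "leave"): "PATIENT WANTS DISCHARGE",
    ("feel", "sick"):      "PATIENT REPORTS ILLNESS",
}

def gist(utterance):
    content = {w for w in utterance.lower().split()
               if w.isalpha() and w not in STOPWORDS}
    for cue, message in PATTERNS.items():
        if set(cue) <= content:       # all cue words survive garbling
            return message
    return None

# Half the words are garbled, yet the surviving cue words still
# select the intended message.
print(gist("well um i xqzt want to leave thss hospital"))
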
	To approximate such human achievements we require a new
perspective and a practical method which differs from that of current
linguistic approaches. This alternate approach should incorporate
those aspects of parsers which have been found to work well, e.g.
detecting embedded clauses. Also, individualistic features
characteristic of an idiolect should have dominant emphasis. Parsers
represent complex and refined algorithms. While on the one hand they
subject a sentence to a detailed and sometimes overkilling analysis,
on the other they are finicky and oversensitive. A conventional
parser simply halts if a word in the input sentence is not present in
its dictionary. It finds ungrammatical expressions such as double
prepositions (`Do you want to get out of from the hospital?') quite
confusing. Parsers constitute a tight conjunction of tests rather
than a loose disjunction. As more and more tests are added to the
conjunction, the parser behaves like a finer and finer filter, making
it increasingly difficult for an expression to pass through. Parsers
do not allow for the ununderstandings and misunderstandings typical
of everyday human dialogues.
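
	The contrast between a conjunction of tests and a disjunction
of patterns can be made concrete. (In the following sketch the
dictionary, tests and patterns are invented for exposition.)

# Expository sketch (invented dictionary, tests and patterns):
# a conjunctive recognizer rejects an input as soon as any single
# test fails; a disjunctive one accepts as soon as any pattern fits.

DICTIONARY = {"do", "you", "want", "to", "get", "out", "of", "the",
              "hospital"}
PREPOSITIONS = {"of", "from"}

def conjunctive_parse(words):
    for w in words:
        if w not in DICTIONARY:
            return None                # unknown word: parse halts
    for a, b in zip(words, words[1:]):
        if a in PREPOSITIONS and b in PREPOSITIONS:
            return None                # double preposition: parse halts
    return "PARSED"

def disjunctive_match(words):
    patterns = [({"out", "hospital"}, "WANTS OUT OF HOSPITAL")]
    for cue, reading in patterns:
        if cue <= set(words):          # any one match is enough
            return reading
    return None

odd = "do you want to get out of from the hospital".split()
print(conjunctive_parse(odd))   # None: `from' is not in the dictionary
print(disjunctive_match(odd))   # WANTS OUT OF HOSPITAL
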
	Finally, it is difficult to keep consistent a dictionary of
over 500 multiple-sense words classified by binary semantic features
or rules. For example, suppose a noun (Ni) is used by some verbs as a
direct object in the semantic sense of a physical object. Then it is
noticed that Ni is also used by other verbs in the sense of a
location, so `location' is added to Ni's list of semantic features.
Now Ni suddenly qualifies as a direct object for a lot of other
verbs, but the resultant combinations make no sense even in an
idiolect. If a special feature is then created for Ni, one loses the
power of general classes of semantic features. Adding a single
semantic feature can result in the propagation of hidden
inconsistencies and unwanted side-effects. As the dictionary grows it
becomes increasingly unstable and difficult to control.

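	The instability can be seen in miniature. (The nouns, verbs
and features below are invented for exposition.)

# Miniature of the feature-propagation problem (invented entries).
# Verbs restrict their direct objects by a required semantic feature.

features = {"rock": {"physobj"}, "park": {"location"}}
verbs = {"throw": "physobj",      # throw X requires a physical object
         "visit": "location"}     # visit X requires a location

def acceptable(verb, noun):
    return verbs[verb] in features.get(noun, set())

# `park' turns out to be used as a physical object too ("the park
# was paved over"), so `physobj' is added to its feature list...
features["park"].add("physobj")

# ...and now every physobj-taking verb accepts it, sense or nonsense:
print(acceptable("throw", "park"))    # True -- `throw the park'?
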
	On intuitive grounds it is hardly credible that conventional
parsers model the mechanisms people use in processing language. As
Chomsky [ ] has remarked, `We noted at the outset that performance
and competence must be sharply distinguished if either is to be
studied successfully. We have now described a certain model of
competence. It would be tempting, but quite absurd, to regard it as a
model of performance as well. Thus we might propose that to produce a
sentence the speaker goes through the successive steps of
constructing a base-derivation, line by line from the initial symbol
S, then inserting lexical items and applying grammatical
transformations to form a surface structure, and finally applying the
phonological rules in their given order, in accordance with the
cyclic principle discussed above. There is not the slightest
justification for any such assumption.' It should be clear from these
strictures that the transformational approach has been concerned with
the production rather than the interpretation of sentences and that
it is not oriented towards human performance but towards an idealized
grammar of competence.

	Early attempts to develop a pattern-matching approach using
special-purpose heuristics have been described by Colby, Watt and
Gilbert [ ], Weizenbaum [ ] and Colby and Enea [ ]. The limitations
of these attempts are well known to workers in artificial
intelligence. The man-machine conversations of such programs soon
become impoverished and boring. Such primitive context-restricted
programs often grasp a topic well enough but too often do not
understand quite what is being said about the topic, with amusing or
disastrous consequences. This shortcoming is a consequence of the
limitations of a pattern-matching approach lacking a rich conceptual
structure into which the pattern abstracted from the input can be
matched for inferencing. The programs need a subroutine structure,
both pattern-directed and specific, to achieve generalizations.
	Winograd's program, while limited to a few objects and
relations in a toy robotic world, represented an improvement in
handling dialogues. He understandably chose not to face the problems
of multiple word-senses, idioms, fragments and figurative
expressions. Another pattern-matching approach is that of Wilks [ ],
working in the area of machine translation. His algorithm constructs
a pattern from English text input which is matched against templates
in an interlingual data base from which, in turn, French output is
generated without using a generative grammar.

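	As a toy illustration of the general template idea only (this
is not Wilks's actual algorithm or data base; everything below is
invented for exposition):

# Toy illustration of translation via interlingual templates
# (invented data; not Wilks's actual algorithm or data base).
# A bare actor-action-object pattern is abstracted from the English
# input, matched against stored templates, and the French output is
# produced from the matched template rather than by a grammar.

TEMPLATES = [
    # (actor, action, object) paired with a French output form
    (("man", "drink", "wine"), "l'homme boit le vin"),
]

# Canonical forms; None marks function words to be dropped.
CANON = {"drinks": "drink", "the": None, "a": None}

def translate(sentence):
    words = [CANON.get(w, w) for w in sentence.lower().split()]
    pattern = tuple(w for w in words if w is not None)
    for template, french in TEMPLATES:
        if pattern == template:
            return french
    return None

print(translate("The man drinks the wine"))   # l'homme boit le vin
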
	In the course of constructing an interactive simulation of
paranoia we were faced with the problem of dealing with natural
language as it is used in the doctor-patient situation of a
psychiatric interview. This domain of discourse admittedly contains
many stereotypes and is constrained in topics (Newton's laws are
rarely discussed). But it is rich enough in verbal behavior to be a
challenge to a language-understanding algorithm, since a variety of
human experiences are discussed in this domain, including the
interpersonal relation which develops between the interview
participants. A look at the contents of a thesaurus reveals that
words relating to people and their interrelations make up 70% of
language.
	The judgement of paranoia is made by psychiatrists relying
mainly on the verbal behavior of the interviewed patient. If a
paranoid model is to exhibit paranoid behavior in a psychiatric
interview, it must be capable of handling dialogues typical of the
doctor-patient context. Since the model can communicate only through
teletyped messages, the vis-a-vis aspects of the usual psychiatric
interview are absent. Thus the model should be able to deal with
typewritten natural language input and to output replies which are
indicative of an underlying paranoid thought process in the context
of a psychiatric interview.

	In an interview there is always a who saying something to a
whom with definite intentions and expectations. There are two
situations to be taken into account, the one being talked about and
the one the participants are in. Sometimes the latter becomes the
former. Participants in dialogues have intentions, and dialogue
algorithms must take this into account. The doctor's intention is to
gather certain kinds of information while the patient's intention is
to give information and get help. A job is to be done; it is not
small talk. Our working hypothesis is that each participant in the
dialogue understands the other by matching selected
idiosyncratically-significant patterns in the input against
conceptual patterns which contain information about the situation or
event being described linguistically. This understanding is
communicated reciprocally by linguistic responses judged appropriate
to the intentions and expectations of the participants and to the
requirements of the situation. In this paper we shall describe only
the context-sensitive processes used to extract a pattern from
natural language input. In a later communication we shall describe
the inferential processes carried out at the conceptual level once a
`paradigmatic' pattern has been received from the input-analysing
processes.

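	The kind of pattern extraction meant here can be suggested in
miniature. (The fuzz list, synonym table and stored patterns below
are invented for exposition; they are not the actual program's data.)

import string

# Miniature of context-sensitive pattern extraction: fuzz words are
# discarded, synonyms are collapsed to canonical words, and the
# abstracted pattern is matched against stored conceptual patterns.
# All word lists and patterns here are invented for illustration.

FUZZ = {"well", "now", "please", "really", "are", "is", "am"}
SYNONYMS = {"mom": "mother", "frightened": "afraid",
            "scared": "afraid"}

STORED_PATTERNS = {
    ("why", "you", "afraid"):    "QUERY CAUSE-OF-FEAR",
    ("you", "afraid", "mother"): "ASSERT FEAR-OF-MOTHER",
}

def extract_pattern(line):
    pattern = []
    for raw in line.lower().split():
        w = raw.strip(string.punctuation)
        if not w or w in FUZZ:
            continue                       # fuzz is simply discarded
        pattern.append(SYNONYMS.get(w, w)) # collapse to canonical word
    return tuple(pattern)

def recognize(line):
    # the abstracted pattern, not the raw wording, is looked up
    return STORED_PATTERNS.get(extract_pattern(line))

print(recognize("Well now, why are you afraid?"))  # QUERY CAUSE-OF-FEAR
print(recognize("Why are you scared?"))            # QUERY CAUSE-OF-FEAR
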
(HORACE WRITES UP THE ANALYZER)